CoZo+ - A Content Zoning Engine for textual documents
Authors
Abstract
Content zoning can be understood as the segmentation of textual documents into zones. The idea is inspired by [6], which first proposed an approach for the argumentative zoning of textual documents. With the prototypical CoZo+ engine, we focus on content zoning for the automatic processing of textual streams, treating only the actors as zones. The information gained can be used to automatically recognize content for pre-defined actors. We see CoZo+ as a necessary preliminary step towards the automatic generation of summaries and towards making the intellectual ownership of documents detectable.
Similar resources
Muma: a Music Search Engine Based on Content Analysis
Existing music search engines are often limited to the textual modality (i.e., searching the textual metadata that are attached to music documents). We introduce here MUMA (http://muma.labs.exalead.com), a new search engine that relies both on textual metadata and signal processing metadata. MUMA allows the user to search for particular chords sequences, for specific moods, and to listen to aut...
Template for Regular Entry
DEFINITION The widespread search engines, in the professional as well as the personal context, used to work on the basis of textual information associated or extracted from indexed documents. Nowadays, most of the exchanged or stored documents have multimedia content. To reduce the technological gap so that these engines still can work on multimedia content, it is very convenient developing met...
A Classification Model for Mining Research Publications from Crowdsourced Data
Automatic access of natural language meaning is a prominent way of implementing search engines for document classification. The technique is difficult and often presents search results in rough approximates. It has minimal linguistic processing performed to identify content words like nouns and verbs in indexed documents. However, word frequency in documents can be taken as clues to their simil...
Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base
This paper presents a framework and methodology to improve the search experience in digital library systems. The approach taken is to cluster a textual knowledgebase along multiple relations and return search results in the form of small, focused clusters. Specifically, we generate multiple relationship networks, one per relationship type, and then cluster these networks. At search time, we pre...
The PROBADO-Framework: Content-Based Queries for non-textual Documents
In this paper we describe the system architecture of PROBADO, a project funded by the German Research Foundation (DFG). Its main goal is to provide a general library infrastructure for dealing with non-textual documents, in particular for content-based searching. PROBADO provides an infrastructure that allows integrating existing data repositories and content-based search engines into one commo...
Journal title:
- CoRR
Volume abs/0811.0453, Issue -
Pages -
Publication date 2008